Building a DIY Legal Research Tool with AI


Posted: Sept. 22, 2025


While working on another project, I needed to look up case law related to the Maryland Rules of Professional Conduct. To my surprise, there was no straightforward way to do this without a Westlaw or Lexis subscription. The opinions themselves are public, but there was no practical way to find out which cases referenced which rules. That got me thinking: could I replicate Westlaw’s “Citing References” feature, at least for this small corner of case law, using nothing more than my laptop, a bit of Python, and some AI? TLDR: Yes, and you can try it out here.

 

Step 1: Collecting the Cases

The Maryland Judiciary makes all Attorney Grievance cases available for download on its website. The challenge: there are 1,481 cases involving the Attorney Grievance Commission, and I had no intention of clicking through them one by one.

Since all the links appear on a single results page, I downloaded the HTML and wrote a Python script to loop through each row of the table. Whenever “Attorney Grievance” appeared as a party, the script extracted the link and automatically downloaded the PDF into a folder on my machine. In minutes, I had my dataset of disciplinary opinions and orders.
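For readers who want to try something similar, here is a minimal sketch of what that kind of script might look like. It is not the exact script I ran: the class and function names, the sample table, and the example.invalid base URL are all made up for illustration, and the real results page may be laid out differently. It uses only the Python standard library.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse
from urllib.request import urlretrieve


class GrievanceLinkParser(HTMLParser):
    """Collect PDF links from table rows that mention a given party."""

    def __init__(self, base_url, party="Attorney Grievance"):
        super().__init__()
        self.base_url = base_url
        self.party = party.lower()
        self.links = []         # matching links across the whole page
        self._in_row = False
        self._row_text = []     # text fragments seen in the current row
        self._row_hrefs = []    # hrefs seen in the current row

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._in_row = True
            self._row_text, self._row_hrefs = [], []
        elif tag == "a" and self._in_row:
            href = dict(attrs).get("href")
            if href:
                self._row_hrefs.append(href)

    def handle_data(self, data):
        if self._in_row:
            self._row_text.append(data)

    def handle_endtag(self, tag):
        if tag == "tr" and self._in_row:
            self._in_row = False
            row_text = " ".join(self._row_text).lower()
            if self.party in row_text:
                for href in self._row_hrefs:
                    if href.lower().endswith(".pdf"):
                        # Resolve relative links against the page URL.
                        self.links.append(urljoin(self.base_url, href))


def extract_grievance_links(html, base_url):
    parser = GrievanceLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def download_pdfs(links, folder="cases"):
    """Save each PDF into `folder`. Not called below; it needs network access."""
    Path(folder).mkdir(exist_ok=True)
    for url in links:
        name = Path(urlparse(url).path).name
        urlretrieve(url, str(Path(folder) / name))


# Made-up sample table standing in for the saved results page.
sample = """
<table>
  <tr><td>Attorney Grievance Commission v. Doe</td>
      <td><a href="/opinions/1a22.pdf">AG 1</a></td></tr>
  <tr><td>State v. Roe</td>
      <td><a href="/opinions/2a22.pdf">No. 2</a></td></tr>
</table>
"""
links = extract_grievance_links(sample, "https://example.invalid")
print(links)
```

Keeping the link extraction separate from the download step makes it easy to check the list of links against the saved HTML before fetching anything.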

Step 2: Extracting the Rules with AI

Now that I had the cases, the real work began. I built another script to process the PDFs one at a time. Each document was sent to OpenAI with a prompt along the lines of:

“Read the attached case and identify which Maryland Rules of Professional Conduct were alleged to have been violated. Return the rules along with whether the allegations were proven, the case number, and whether the document is an opinion or an order.”

The AI returned its answers in JSON, a structured data format that’s far easier to work with than free text. While the model occasionally generated malformed JSON, the majority of the output required little to no correction—a huge timesaver compared to manually reviewing hundreds of opinions.
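Because some replies come back malformed, it helps to check each one before trusting it. The sketch below shows one way to do that. The field names (case_number, document_type, rules, rule, proven) are an assumed schema chosen for illustration, not necessarily the one my prompt produced, and the sample case number is made up.

```python
import json

FENCE = "`" * 3  # Markdown code fence that models sometimes wrap JSON in

# Hypothetical schema: top-level keys and per-rule keys.
REQUIRED_KEYS = {"case_number", "document_type", "rules"}
RULE_KEYS = {"rule", "proven"}


def parse_model_reply(reply):
    """Return a validated dict, or None if the reply needs manual cleanup."""
    text = reply.strip()
    # Drop a leading fence line and a trailing fence line if present.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
        text = "\n".join(lines)
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None
    if data["document_type"] not in ("opinion", "order"):
        return None
    if not isinstance(data["rules"], list):
        return None
    for entry in data["rules"]:
        if not isinstance(entry, dict) or not RULE_KEYS.issubset(entry):
            return None
    return data


# Made-up replies for illustration.
good = (
    '{"case_number": "AG 12", "document_type": "opinion", '
    '"rules": [{"rule": "19-301.1", "proven": true}]}'
)
fenced = FENCE + "json\n" + good + "\n" + FENCE
truncated = '{"case_number": "AG 12", "document_type": "opi'

for reply in (good, fenced, truncated):
    result = parse_model_reply(reply)
    print("ok" if result else "needs manual review")
```

Anything that returns None can be set aside in a separate folder for a second attempt or a manual fix, so one bad reply does not stop the whole batch.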

 

Step 3: Making It Searchable

With a folder full of structured case data, the next step was to make it usable. I imported everything into a database and built a simple interface on my website: Maryland Attorney Grievance Search Tool.

The interface lets users click on a Rule and instantly see all cases discussing it. To make it more useful, I also included summaries of each case and direct links to the original opinions.
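To give a rough idea of what sits behind a click on a Rule, here is a small sketch using SQLite. The database engine, table names, columns, and sample records are all assumptions made for illustration; the site’s actual schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.executescript("""
CREATE TABLE cases (
    case_number   TEXT PRIMARY KEY,
    document_type TEXT,
    summary       TEXT,
    pdf_url       TEXT
);
CREATE TABLE rule_findings (
    case_number TEXT REFERENCES cases(case_number),
    rule        TEXT,
    proven      INTEGER
);
CREATE INDEX idx_findings_rule ON rule_findings(rule);
""")


def add_case(conn, record):
    """Insert one extracted record (a dict in the hypothetical schema)."""
    conn.execute(
        "INSERT INTO cases VALUES (?, ?, ?, ?)",
        (record["case_number"], record["document_type"],
         record.get("summary", ""), record.get("pdf_url", "")),
    )
    conn.executemany(
        "INSERT INTO rule_findings VALUES (?, ?, ?)",
        [(record["case_number"], r["rule"], int(bool(r["proven"])))
         for r in record["rules"]],
    )


def cases_for_rule(conn, rule):
    """Return (case_number, document_type, proven) rows for one rule."""
    return conn.execute(
        """SELECT c.case_number, c.document_type, f.proven
           FROM rule_findings AS f
           JOIN cases AS c ON c.case_number = f.case_number
           WHERE f.rule = ?
           ORDER BY c.case_number""",
        (rule,),
    ).fetchall()


# Made-up records standing in for the extracted JSON files.
add_case(conn, {
    "case_number": "AG 1", "document_type": "opinion",
    "rules": [{"rule": "19-301.1", "proven": True},
              {"rule": "19-308.4", "proven": False}],
})
add_case(conn, {
    "case_number": "AG 2", "document_type": "order",
    "rules": [{"rule": "19-308.4", "proven": True}],
})
conn.commit()

matches = cases_for_rule(conn, "19-308.4")
print(matches)
```

Storing one row per case and rule pair, rather than a list of rules inside each case row, is what makes the lookup by rule a single indexed query.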

One thing was missing, though: the text of the rules themselves. That meant another round of scraping—this time for the PDFs of Title 19 of the Maryland Rules. I sent those to AI to extract just the rule text, added the results to my database, and linked everything together.

 

Reflections on the Process

The project wasn’t without hiccups. AI occasionally produced broken JSON, and I had to do some manual cleanup. But the remarkable thing is how feasible this was. Just five years ago, this project would have taken hundreds of hours of human review. Instead, it cost less than $10 in API credits and a few days of coding.

While the tool is still limited (there’s no full-text search yet, and it only covers grievance cases), it demonstrates how dramatically the landscape of legal tech has shifted. Barriers that once required massive institutional resources have largely disappeared.

For attorneys and technologists alike, projects like this show how AI can democratize access to legal information.

Check out the site here.

 

Copyright © Matthew Stubenberg